The World Wide Web (WWW) is comprised of an expansive network of interconnected computers upon which businesses, governments, groups, and individuals throughout the world maintain inter-linked computer files known as web pages. Users navigate these pages by means of computer software programs commonly known as Internet browsers. The vastness of the unstructured WWW causes users to rely primarily on Internet search engines to retrieve information or to locate businesses. These search engines use various means to determine the relevance of a user-defined search to the information retrieved.
Typically, each search result rendered by the search engine includes a list of individual entries that have been identified by the search engine as satisfying the user's search expression. Each entry or “hit” includes a hyperlink that points to a Uniform Resource Locator (URL) location or web page. In addition to the hyperlink, certain search result pages include a short summary or abstract that describes the content of the web page.
A common technique for accessing textual materials on the Internet is by means of a “keyword” combination, generally with Boolean operators between the words or terms, where the user enters a query comprised of an alphanumeric search expression or keywords. In response to the query, the search engine sifts through available web sites to match the words of the search query to words in a metadata repository, in order to locate the requested information.
This word-match based search engine parses the metadata repository to locate a match by comparing the words of the query to indexed words of documents in the repository. If there is a word match between the query and words of one or more documents, the search engine identifies those documents and returns the search results in the form of HTML pages.
Furthermore, not only is the quantity of the WWW material increasing, but the types of digitized material are also increasing. For example, it is possible to store alphanumeric texts, data, audio recordings, pictures, photographs, drawings, images, video and prints as various types of digitized data. However, such large quantities of materials are of little value unless the desired information is readily queryable, browseable and retrievable. While certain techniques have been developed for accessing specific types of textual materials, these techniques are at best moderately adequate for accessing graphic or other specialized materials. Consequently, there are large bodies of published materials that still remain significantly underutilized.
As a result, it is becoming increasingly important to enable users to search by content and context, and not be limited to textual searches.